Computational models of the human visual cortex: on individual differences and ecologically valid input statistics
Perception relies on cortical processes in response to sensory stimuli. Visual input entering the
eyes ascends a cascade of processing steps from the retina to high-level regions of the cortex.
Vision science investigates these transformations that give rise to high-level processing of
visual objects, such as object recognition. In this thesis I investigate computational models
of the human visual cortex with regard to their ability to predict cortical responses to visual
objects. In particular, I describe two factors playing an important role in using deep neural
networks (DNNs) to better understand cortical functioning: the initial weight state and
ecologically more valid input statistics.
In Chapter 1 of this thesis I will introduce relevant literature pertaining to deep neural
networks as a modeling framework for the visual cortex. Next, I will lay out the motivation
for the research questions investigated in this thesis and described in detail in Chapters 2, 3,
and 4.
Chapter 2 focuses on the impact of the initial weight state of a model on its ability
to predict cortical representations. I describe work in which we demonstrate that two
DNN instances, identical in every aspect but their initial weights, yield very dissimilar
representations. Relying on single network instances to predict cortical activation patterns
in response to sensory stimuli poses a problem for computational neuroscience: depending
on the initial set of weights the ability to mirror the cortical representations of these stimuli
might vary. Thus, results based on single (“off-the-shelf”) model instances - as commonly
used in computational neuroscience - may not generalize. In contrast, using multiple DNN
instances might alleviate this problem, as they allow insight into the variability in a given
model architecture's ability to predict cortical representations. These individual differences between
model instances suggest that, for results to generalize more easily, model instances
should be treated similarly to human experimental participants.
In Chapter 3 I focus on ecologically more valid input statistics (in the form of training
images) aiming to improve a model’s ability to predict cortical representations. The most
successful models of the human visual cortex to date are DNNs trained on object recognition
tasks designed with machine learning goals in mind. However, the image sets used for training
these DNNs are often not ecologically realistic. For example, training on the most widely used image set in computational neuroscience (ImageNet Large Scale Visual Recognition
Challenge (ILSVRC) 2012) requires the fine-grained distinction of 120 dog breeds, but does
not contain visual object categories encountered frequently in everyday human life (e.g.
woman, man, or child). This suggests that taking into account the human visual experience
when training models of the human visual cortex on a categorization task might help to
predict cortical representations. In this Chapter I describe the creation of a set of images
aimed at mimicking the human visual diet: ecoset. Ecoset contains more than 1.5 million
images from 565 basic level categories and is the largest image set specifically designed for
computational neuroscience to date. Ecoset is freely available to allow the community to test
their own hypotheses of models trained with input statistics matched to the human visual
environment.
In Chapter 4 we build on the results from the previous two Chapters. Using multiple
DNN instances I investigate whether a brain-inspired model architecture (vNet) trained on
ecologically more valid input statistics (ecoset) might improve its ability to predict cortical
representations. I first demonstrate that ecoset might improve an architecture’s ability to
mirror cortical representations. Furthermore, ecoset-trained vNet also outperforms state-of-the-art
computer vision and computational neuroscience models in terms of mirroring cortical
representations in the human brain. Thus, incorporating biological and ecological aspects,
such as brain-inspired architectural features and ecologically more valid input statistics, into
computational models may yield better predictions of response patterns in the human visual
cortex.
Treating DNN instances similarly to human experimental participants and considering
ecological and biological factors for building these DNNs may be an important step towards
better models of the human visual cortex. Such models might allow a better understanding of
the cortical processes underlying high-level vision in the human brain.
Cambridge Trust - Vice Chancellor's Award 2015
Cambridge Philosophical Society
MRC Cognition and Brain Sciences Uni
Recurrent neural networks can explain flexible trading of speed and accuracy in biological vision.
Deep feedforward neural network models of vision dominate in both computational neuroscience and engineering. The primate visual system, by contrast, contains abundant recurrent connections. Recurrent signal flow enables recycling of limited computational resources over time, and so might boost the performance of a physically finite brain or model. Here we show: (1) Recurrent convolutional neural network models outperform feedforward convolutional models matched in their number of parameters in large-scale visual recognition tasks on natural images. (2) Setting a confidence threshold, at which recurrent computations terminate and a decision is made, enables flexible trading of speed for accuracy. At a given confidence threshold, the model expends more time and energy on images that are harder to recognise, without requiring additional parameters for deeper computations. (3) The recurrent model's reaction time for an image predicts the human reaction time for the same image better than several parameter-matched and state-of-the-art feedforward models. (4) Across confidence thresholds, the recurrent model emulates the behaviour of feedforward control models in that it achieves the same accuracy at approximately the same computational cost (mean number of floating-point operations). However, the recurrent model can be run longer (higher confidence threshold) and then outperforms parameter-matched feedforward comparison models. These results suggest that recurrent connectivity, a hallmark of biological visual systems, may be essential for understanding the accuracy, flexibility, and dynamics of human visual recognition.
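The threshold mechanism in point (2) can be illustrated with a minimal sketch. The abstract does not say how confidence is computed, so the maximum softmax probability is assumed here purely for illustration. The function name `decide_with_threshold` and the example readouts are hypothetical and not taken from the paper.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()


def decide_with_threshold(logits_per_step, threshold):
    """Stop at the first time step whose confidence reaches the threshold.

    logits_per_step: sequence of per-class readouts, one per recurrent step.
    Confidence is ASSUMED to be the maximum softmax probability.
    Returns (predicted_class, steps_used); steps_used stands in for
    the model's reaction time.
    """
    if len(logits_per_step) == 0:
        raise ValueError("need at least one time step")
    for step, logits in enumerate(logits_per_step, start=1):
        probs = softmax(np.asarray(logits, dtype=float))
        if probs.max() >= threshold:
            break  # confident enough: terminate the recurrent computation
    # If the threshold was never reached, this is the final step's decision.
    return int(probs.argmax()), step


# Hypothetical readouts: evidence builds quickly for the "easy" image
# and slowly for the "hard" one.
easy = [[4.0, 0.0, 0.0], [5.0, 0.0, 0.0], [6.0, 0.0, 0.0]]
hard = [[0.5, 0.4, 0.0], [1.5, 0.5, 0.0], [4.0, 0.5, 0.0]]

for name, readouts in (("easy", easy), ("hard", hard)):
    for threshold in (0.6, 0.9):
        label, steps = decide_with_threshold(readouts, threshold)
        print(f"{name} image, threshold {threshold}: class {label} after {steps} step(s)")
```

In this toy setting, raising the threshold increases the number of steps spent on the hard image while leaving the easy image unchanged. This is the speed-for-accuracy trade the abstract describes.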
Individual differences among deep neural network models.
Deep neural networks (DNNs) excel at visual recognition tasks and are increasingly used as a modeling framework for neural computations in the primate brain. Just like individual brains, each DNN has a unique connectivity and representational profile. Here, we investigate individual differences among DNN instances that arise from varying only the random initialization of the network weights. Using tools typically employed in systems neuroscience, we show that this minimal change in initial conditions prior to training leads to substantial differences in intermediate and higher-level network representations despite similar network-level classification performance. We locate the origins of the effects in an under-constrained alignment of category exemplars, rather than misaligned category centroids. These results call into question the common practice of using single networks to derive insights into neural information processing and rather suggest that computational neuroscientists working with DNNs may need to base their inferences on groups of multiple network instances.
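The abstract does not name the specific tools used to compare network instances. As a rough sketch, the code below assumes representational similarity analysis, one common choice in systems neuroscience: each instance's layer activations are summarised as a representational dissimilarity matrix (RDM), and the RDMs of two instances are then correlated. The function names and the synthetic activations are hypothetical and are not results from the paper.

```python
import numpy as np


def rdm(activations):
    """Correlation-distance RDM for an (n_stimuli, n_units) activation matrix."""
    return 1.0 - np.corrcoef(activations)


def rdm_correlation(acts_a, acts_b):
    """Pearson correlation between the upper triangles of two RDMs.

    Both inputs must cover the same stimuli in the same order; the
    number of units may differ between the two network instances.
    """
    rdm_a, rdm_b = rdm(acts_a), rdm(acts_b)
    upper = np.triu_indices(rdm_a.shape[0], k=1)  # skip diagonal and duplicates
    return float(np.corrcoef(rdm_a[upper], rdm_b[upper])[0, 1])


# Synthetic stand-ins for layer activations (20 stimuli, 50 units).
rng = np.random.default_rng(0)
shared = rng.normal(size=(20, 50))
instance_a = shared + 0.1 * rng.normal(size=shared.shape)
instance_b = shared + 0.1 * rng.normal(size=shared.shape)
unrelated = rng.normal(size=(20, 50))

print("similar instances:  ", rdm_correlation(instance_a, instance_b))
print("unrelated instances:", rdm_correlation(instance_a, unrelated))
```

Because the comparison operates on stimulus-by-stimulus dissimilarities rather than on individual units, it does not require any unit-to-unit correspondence between instances. Repeating it over many pairs of instances would give a distribution of similarities rather than a single value.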